OPTIMAL SEQUENTIAL DETECTION IN
MULTI-STREAM DATA
National University of Singapore

Consider a large number of detectors, each generating a data
stream. The task is to detect online distribution changes in a small
fraction of the data streams. Previous approaches to this problem
include the use of mixture likelihood ratios and sum of CUSUMs. We
provide here extensions and modifications of these approaches that
are optimal in detecting normal mean shifts. We show how the (optimal)
detection delay depends on the fraction of data streams undergoing
distribution changes as the number of detectors goes to infinity.
There are three detection domains. In the first domain for moderately
large fractions, immediate detection is possible. In the second domain
for smaller fractions, the detection delay grows logarithmically
with the number of detectors, with an asymptotic constant extending
those in sparse normal mixture detection. In the third domain
for even smaller fractions, the detection delay lies in the framework
of the classical detection delay formula of Lorden. We show that the
optimal detection delay is achieved by the sum of detectability score
transformations of either the partial scores or CUSUM scores of the
data streams.


1. Introduction. Consider $N$ data streams, with $X_{nt}$ the observation
of the $n$th data stream at time $t$. We want to detect as quickly as we can
a possible change-point $\nu \ge 1$, such that for some
$\mathcal{N} \subset \{1, \ldots, N\}$, the post-change observations $X_{nt}$
for $n \in \mathcal{N}$ (and $t \ge \nu$) have distributions different from
the pre-change observations. Applications for this multi-stream sequential
change-point detection problem include hospital management,
infectious-disease modeling and target detection.

Tartakovsky and Veeravalli [19] consider distributed decision-making and
optimal fusion, with minimax, uniform and Bayesian formulations for
sequential detection in multi-stream data. Though optimal detection is
achieved, the asymptotics involve $N$ fixed as the average run lengths go
to infinity.

Mei [13] considers distribution changes that do not affect all data streams,
and recommends a sum of CUSUM approach. The advantages of his approach
are that the distribution changes are not assumed to have occurred
simultaneously, and that his stopping rule can be computed efficiently.
However, as has been shown in an earlier simulation study, the detection
delay is relatively large when the number of data streams undergoing
change, $\#\mathcal{N}$, is small.

*Supported by the National University of Singapore grant R-155-000-158-112

imsart-aos ver. 2011/11/15 file: seq-sparse8.tex date: January 20, 2016

HOCK PENG CHAN

Xie and Siegmund [20] are the first to look from the perspective of
$\#\mathcal{N}$ small. They suggest a mixture likelihood ratio (MLR) approach
and show via simulation studies the superiority of their MLR stopping rules
in detecting over a wide range of $\#\mathcal{N}$ compared to other known
approaches. They also provide analytical approximations to average run
lengths and detection delays of their stopping rules that are accurate and
useful. However they do not give any optimality theory for small or
moderate $\#\mathcal{N}$.

In parallel developments, motivated by applications in DNA copy-number
samples, there have been advances made, see Siegmund, Yakir and Zhang
[18], Jeng, Cai and Li [9] and Chan and Walther [4], on fixed-sample
change-point detection in multiple sequences having a common location index.
The work here also has connections with detection on spatial indices, see
[1, 2, 3].

In this paper we show that subject to an average run length constraint,
a modified version of the MLR stopping rule achieves minimum detection
delay, extending the classical single-stream optimal detection of Lorden [11],
Pollak [15, 16] and Moustakides [14] to multiple data streams, in the
detection of normal mean shifts. In Section 2 we provide the asymptotic lower
bounds of the detection delays for different domains of $\#\mathcal{N}$. Under
the first domain for large $\#\mathcal{N}$, the lower bound is trivially given
by 1. Under the second domain for moderate $\#\mathcal{N}$, the lower bound
grows logarithmically with $N$. Under the third domain for small
$\#\mathcal{N}$, the detection delay grows polynomially with $N$. In Section 3
we show that a MLR stopping rule that tests against the limits of
detectability achieves optimal detection on all three domains. A
window-limited rule, suggested in Lai [10], is incorporated into the stopping
rule for computational savings. In Section 4 a numerical study is performed
to provide justification for using the MLR stopping rule for finite $N$. In
Section 5 we extend the idea of testing against the limits of detectability
to Mei's sum of CUSUM test. Rather than summing the CUSUM scores as in
Mei [13], we suggest instead summing the detectability score transformations
of the CUSUM scores. Optimality of this procedure is shown, but it occurs
only when we select the assumed mean shift at a specific value between one
and two times the true mean shift, surprisingly not at the true mean shift
itself. In Sections 6-8 we provide the proofs of Theorems 1-3.

2. Detection delay lower bound. Let $X_{nt}$, $1 \le n \le N$, $t \ge 1$, be
distributed as independent N($\mu_{nt}$,1). Assume that at some unknown time
$\nu \ge 1$, there are mean shifts in a subset $\mathcal{N}$ of the data
streams. More specifically we assume that

(2.1)    $\mu_{nt} = \mu I_{\{t \ge \nu,\, n \in \mathcal{N}\}}$ for some $\mu > 0$,

with $I_{\{1 \in \mathcal{N}\}}, \ldots, I_{\{N \in \mathcal{N}\}}$ i.i.d.
Bernoulli($p$) for some $0 < p < 1$. We shall let $P_\nu$ ($E_\nu$) denote
probability measure (expectation) with respect to distribution changes at
time $\nu$, with $\nu = \infty$ indicating no change. In Appendix B
we provide an analogue of Theorem 1 below on a minimax formulation of
the problem, with a constraint on $\sum_{n=1}^N I_{\{n \in \mathcal{N}\}}$
instead of assuming $I_{\{n \in \mathcal{N}\}}$ to be i.i.d. Bernoulli.

A standard measure of the performance of a stopping rule $T$, see Pollak
[15, 16], is the (expected) detection delay

(2.2)    $D_N(T) := \sup_{1 \le \nu < \infty} E_\nu(T - \nu + 1 \mid T \ge \nu),$

subject to the constraint that ARL($T$) $(:= E_\infty T) \ge \gamma$ for some
$\gamma \ge 1$.

In this section we find (asymptotic) lower bounds of $D_N(T)$ under the
conditions that as $N \to \infty$,

(2.3)    $\log \gamma \sim N^{\zeta}$ for some $0 < \zeta < 1$,

(2.4)    $p \sim N^{-\beta}$ for some $0 < \beta < 1$.

In Sections 3 and 5, we devise optimal detectability score stopping rules
that achieve this lower bound. In Theorem 1 below, only
$\beta > \frac{1-\zeta}{2}$ is considered. For $\beta < \frac{1-\zeta}{2}$,
the detectability score stopping rules achieve asymptotic detection delay
of 1, and are hence optimal.

For $\frac{1-\zeta}{2} < \beta < 1 - \zeta$, the detection delay lower bound
grows logarithmically with $N$. The proportionality constant is

$\rho(\beta,\zeta) = \beta - \frac{1-\zeta}{2}$ if $\frac{1-\zeta}{2} < \beta < \frac{3(1-\zeta)}{4}$,

$\rho(\beta,\zeta) = (\sqrt{1-\zeta} - \sqrt{1-\zeta-\beta})^2$ if $\frac{3(1-\zeta)}{4} \le \beta < 1 - \zeta$.

This is a two-dimensional extension of the Donoho-Ingster-Jin constants
$\rho(\beta) := \rho(\beta,0)$, which have appeared in connection with sparse
normal mixture detection, see [5, 7, 8]. The extension results from the
additional difficulty of detecting a normal mean shift when there are
multiple comparisons, here for sequential change-point detection, and in [4]
for fixed-sample change-point detection.
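
The two branches of the constant $\rho(\beta,\zeta)$ can be evaluated directly. The following is a minimal Python sketch, not part of the original paper; the function names `rho` and `log_domain_delay` are hypothetical, and the second function simply returns the leading-order term $2\mu^{-2}\rho(\beta,\zeta)\log N$ that appears in the theorems of this paper.

```python
import math


def rho(beta: float, zeta: float = 0.0) -> float:
    """Constant rho(beta, zeta), defined for (1 - zeta)/2 < beta < 1 - zeta."""
    s = 1.0 - zeta
    if not (s / 2.0 < beta < s):
        raise ValueError("need (1 - zeta)/2 < beta < 1 - zeta")
    if beta < 3.0 * s / 4.0:
        # First branch: beta - (1 - zeta)/2.
        return beta - s / 2.0
    # Second branch: (sqrt(1 - zeta) - sqrt(1 - zeta - beta))^2.
    return (math.sqrt(s) - math.sqrt(s - beta)) ** 2


def log_domain_delay(beta: float, zeta: float, mu: float, n_streams: int) -> float:
    """Leading-order term 2 * mu^-2 * rho(beta, zeta) * log N (illustrative helper)."""
    return 2.0 * rho(beta, zeta) * math.log(n_streams) / mu ** 2
```

With $\zeta = 0$ the function reduces to the Donoho-Ingster-Jin constants $\rho(\beta)$, and the two branches agree at $\beta = 3(1-\zeta)/4$.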


Theorem 1. Let $T$ be a stopping rule such that ARL($T$) $\ge \gamma$, with
$\gamma$ satisfying (2.3).

(a) If (2.4) holds with $\frac{1-\zeta}{2} < \beta < 1 - \zeta$, then

(2.5)    $\liminf_{N \to \infty} \frac{D_N(T)}{\log N} \ge 2\mu^{-2}\rho(\beta,\zeta).$

(b) If (2.4) holds with $\beta > 1 - \zeta$, then

(2.6)    $\liminf_{N \to \infty} \frac{\log D_N(T)}{\log N} \ge \beta + \zeta - 1.$


The phase transition between logarithmic and polynomial growth of the
detection delay boundary is at $p = N^{\zeta-1}$, that is, at
$\#\mathcal{N} \doteq \log\gamma$. By Theorem 1(a), for larger
$\#\mathcal{N}$ the detection delay lower bound grows at a $\log N$ rate.
By Theorem 1(b), for smaller $\#\mathcal{N}$ the lower bound is roughly
$(\log\gamma)/\#\mathcal{N}$. The detection delay lower bound in the
logarithmic domain [Theorem 1(a)] is closely linked to the
Donoho-Ingster-Jin detection boundary for sparse normal mixture detection,
whereas the lower bound in the polynomial domain [Theorem 1(b)] lies in the
framework of the classical lower bound established by Lorden (1971) for $N$
fixed as $\gamma \to \infty$.

We shall first establish the connection between Theorem 1(a) and the
Donoho-Ingster-Jin detection boundary $\sqrt{2\rho(\beta)\log N}$. Let
$t \ge \nu \ge 1$ and $k = t - \nu + 1$. If $\frac{1}{2} < \beta < 1$, then as

$k^{-1/2}\sum_{i=\nu}^{t} X_{ni} \sim$ N(0,1) under $P_\infty$, and
$\sim (1-p)$N(0,1) $+\, p$N($\mu\sqrt{k}$,1) under $P_\nu$, for $1 \le n \le N$,

sparse normal mixture detection theory dictates that $k$ should satisfy

$\mu\sqrt{k} \ge [1 + o(1)]\sqrt{2\rho(\beta)\log N}$ (i.e. $k \ge [2\mu^{-2}\rho(\beta) + o(1)]\log N$),

in order for it to be possible that the sum of Type I and II error
probabilities goes to zero, when testing $P_\nu$ against $P_\infty$ with
observations up to time $t$. By (2.2) this leads to

(2.7)    $D_N(T) \ge [2\mu^{-2}\rho(\beta) + o(1)]\log N,$

for any stopping rule $T$ satisfying ARL($T$) $\ge \gamma$ with
$\gamma/\log N \to \infty$. What Theorem 1(a) says is that under (2.3) with
$\zeta$ small enough ($< 1 - \beta$), $\log N$ detection is still possible
with a larger asymptotic constant.

The link between Theorem 1(b) and the classical lower bound formula of
Lorden is best established via the inequality in Mei [12, Prop 2.1], that for
$N$ fixed,

(2.8)    $D_N(T) \ge 2\mu^{-2}\frac{\log\gamma}{\#\mathcal{N}} + O(1)$ as $\gamma \to \infty$.

Theorem 1(b) says that for $\log\gamma \gg \#\mathcal{N}$, the right-hand side
of (2.8) gives the correct order for the attainable detection delay. When
$\#\mathcal{N} \gg \log\gamma$, the right-hand side of (2.8) does not provide
the correct order for the attainable detection delay, as we have already
noted in the previous paragraph situations under which a $\log N$ detection
delay is required. Therefore the $O(1)$ in (2.8) is more appropriately
$O(\log N)$, if the dependence on $N$ in $O(1)$ is made explicit. What
Theorem 1 also says is that the transition is sharp. Once we get out of the
classical $(\log\gamma)/(\#\mathcal{N})$ domain, we fall into the $\log N$
domain; there are no intermediate asymptotics.

3. Optimal detection using detectability score. The detectability
score stopping rule is motivated by the MLR stopping rules of Xie and
Siegmund [20]. In their formulation Xie and Siegmund consider firstly the
ideal situation in which $p$ and $\mu$ are known. The most powerful test at
time $t$, for testing the hypothesis that change-point $\nu = s$ for some
$s \le t$, is based on the log likelihood ratio

$\ell_{\cdot st} := \sum_{n=1}^N \ell_{nst}$, where $\ell_{nst} = \log(1 - p + p e^{\mu S_{nst} - k\mu^2/2})$,

with $k = t - s + 1$ and $S_{nst} = \sum_{i=s}^t X_{ni}$.

Since the change-point $\nu$ is unknown, they suggest maximizing
$\ell_{\cdot st}$ over $s$. The unknown $\mu$ (or more precisely $\mu_n$) in
$\ell_{nst}$ is substituted by $S^+_{nst}/k$, and a small $p_0$ is
substituted for the unknown $p$. In summary their stopping rule can be
expressed as

(3.1)    $T_{XS}(p_0) = \inf\{t : \max_{s:\, k = t-s+1 \in \mathcal{K}} \ell_{\cdot st}(p_0) \ge b\},$

where $\ell_{\cdot st}(p_0) = \sum_{n=1}^N \ell_{nst}(p_0)$ and

$\ell_{nst}(p_0) = \log(1 - p_0 + p_0 e^{(Z^+_{nst})^2/2})$, $Z_{nst} = S_{nst}/\sqrt{k}$.

The set $\mathcal{K}$ in (3.1) refers to a pre-determined set of window sizes.
By applying nonlinear renewal theory, Xie and Siegmund derive accurate
analytical approximations of ARL($T$) and $D_N(T)$ for $T = T_{XS}(p_0)$ and
related stopping rules.

Our stopping rule is also a mixture likelihood ratio but based instead on
the limits of detectability. Let

(3.2)    $T_S(p_0) = \inf\{t : \max_{s:\, k = t-s+1 \in \mathcal{K}} \sum_{n=1}^N g(Z^+_{nst}) \ge b\},$

where $g(z) = \log[1 + p_0(\lambda e^{z^2/4} - 1)]$ and
$\lambda = 2(\sqrt{2} - 1)$. Following Lai [10], we consider window sizes

(3.3)    $\mathcal{K} = \{1, \ldots, k_1\} \cup \{\lfloor r^j k_1 \rfloor : j \ge 1\}$, $k_1 \ge 1$, $r > 1$.
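
The statistic inside (3.2) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the author's implementation: the names `g`, `window_sizes` and `score_statistic` are hypothetical, the window set (3.3) is truncated at a user-chosen `k_max`, and the statistic is recomputed by brute force from the stored observations. A monitoring loop would call `score_statistic` at each time $t$ and stop the first time the value reaches the threshold $b$.

```python
import math

LAMBDA = 2.0 * (math.sqrt(2.0) - 1.0)  # lambda = 2(sqrt(2) - 1)


def g(z: float, p0: float) -> float:
    """Detectability score g(z) = log[1 + p0 (lambda e^{z^2/4} - 1)]."""
    return math.log1p(p0 * (LAMBDA * math.exp(z * z / 4.0) - 1.0))


def window_sizes(k1: int, r: float, k_max: int) -> list:
    """Window set {1..k1} U {floor(r^j k1): j >= 1}, truncated at k_max (assumption)."""
    sizes = set(range(1, min(k1, k_max) + 1))
    j = 1
    while math.floor(r ** j * k1) <= k_max:
        sizes.add(math.floor(r ** j * k1))
        j += 1
    return sorted(sizes)


def score_statistic(x, p0: float, windows) -> float:
    """Max over window sizes k of sum_n g(Z_nst^+), with s = t - k + 1.

    x[n] holds the observations of stream n up to the current time t.
    """
    t = len(x[0])
    best = -math.inf
    for k in windows:
        if k > t:
            continue
        total = 0.0
        for stream in x:
            z = sum(stream[t - k:]) / math.sqrt(k)  # Z_nst = S_nst / sqrt(k)
            total += g(max(z, 0.0), p0)
        best = max(best, total)
    return best
```

The geometric part of the window set keeps the number of windows examined at each time of order $k_1 + \log(k_{\max}/k_1)$, which is the computational saving the window-limited rule is meant to provide.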


Theorem 2. Consider stopping rule $T_S(p_0)$, $0 < p_0 \le 1$, with window
sizes (3.3). If ARL($T_S(p_0)$) $= \gamma$, then threshold
$b \le \log(4\gamma^2 + 2\gamma)$. In addition, if (2.3), (2.4) hold and
$k_1/\log N \to \infty$, $p_0 = c[(\log\gamma)/N]^{1/2}$ for some $c > 0$,
then the following hold as $N \to \infty$.

(a) If $\beta < \frac{1-\zeta}{2}$, then $D_N(T_S(p_0)) \to 1$.

(b) If $\frac{1-\zeta}{2} < \beta < 1 - \zeta$, then

(3.4)    $\frac{D_N(T_S(p_0))}{\log N} \to 2\mu^{-2}\rho(\beta,\zeta).$

(c) If $\beta > 1 - \zeta$, then

$\frac{\log D_N(T_S(p_0))}{\log N} \to \beta + \zeta - 1.$


Remarks. Instead of (2.3), we can model $\gamma$ growing slowly with $N$ by
assuming that

(3.5)    $\gamma/\log N \to \infty$, $\log\gamma = o(N^{\epsilon})$ for all $\epsilon > 0$.

Consider the stopping rule $T_S(p_0)$ with $p_0 = cN^{-1/2}$ for some
$c > 0$. Under (2.4) and (3.5), the asymptotic (3.4) holds with $\zeta = 0$,
and the stopping rule is optimal in view of (2.7).


We shall provide some intuition here on the detectability score
transformation $g$. Consider an i.i.d. sample $Z_1, \ldots, Z_N$ that is
distributed as N(0,1) under the null hypothesis $H_0$. If
$\omega_N \to \infty$ with $\omega_N = o(\sqrt{\log N})$, then
$\#\{n : Z_n \ge \omega_N\}/N$ is asymptotically normal with mean $v_N$ and
variance $v_N/N$, where
$v_N = P_0\{Z_n \ge \omega_N\} = \int_{\omega_N}^{\infty} (2\pi)^{-1/2} e^{-z^2/2}\,dz$.

Therefore under any alternative hypothesis $H_1$, $\sqrt{v_N/N}$ is the
minimum deviation of $P_1\{Z_n \ge \omega_N\}$ from $v_N$ that is detectable.
Since $v_N$ is essentially $e^{-\omega_N^2/2}$ (up to logarithmic terms), the
minimum detectable deviation is $e^{-\omega_N^2/4}/\sqrt{N}$. That is, a
mixture of N(0,1) and a small $p_0 = cN^{-1/2}$ fraction of N(0,2) is at the
threshold of detectability. The detectability score transformation $g$ is
essentially the likelihood ratio between the mixture with $p_0$ fraction
N(0,2), and the null distribution. The factor $(\log\gamma)^{1/2}$ in the
optimal choice of $p_0$ in the statement of Theorem 2 adjusts for the
additional difficulty of each detection due to the multiple comparison
effects of large $\gamma$.

                    N = 100                N = 10^4
Test                b               ARL    b                ARL
max                 12.8            5041   15.9             4930
Mei                 88.5 (106.8)    4997   5640 (8722)      4909
Mei(N^{-1/2})       3.48 (9.81)     4994   3.03 (8.93)      4973
Mei(3N^{-1/2})      5.02 (9.61)     4976   2.31 (6.97)      5017
S(N^{-1/2})         4.25 (18.42)    5066   14.49 (18.42)    5121
S(3N^{-1/2})        6.30 (18.42)    5195   17.21 (18.42)    4986

Table 1

Thresholds b for stopping rules calibrated to ARL = 5000. The upper bounds of the
thresholds, as given in the statement of Theorems 2 and 3, are in brackets.


It is straightforward to check that the detectability score $g(Z^+_{nst})$
in (3.2) is indeed the log likelihood ratio for testing
$Z^+_1, \ldots, Z^+_N$ i.i.d. N(0,1)$^+$ [the distribution of $Z^+$ when
$Z \sim$ N(0,1)] against the alternative that $Z^+_1, \ldots, Z^+_N$ are i.i.d.

$(1 - p_0)$N(0,1)$^+ + p_0[\frac{\lambda}{\sqrt{2}}$HN(0,2) $+ (1 - \frac{\lambda}{\sqrt{2}})\delta_0]$,

where $\delta_0$ denotes a point mass at zero and HN(0,2) the half-normal
distribution with density $\frac{1}{\sqrt{\pi}} e^{-z^2/4}$ on $z > 0$. The
value $\lambda = 2(\sqrt{2} - 1)$ is chosen for convenience, so that $g$ is
continuous at 0. The optimality of $T_S(p_0)$ in Theorem 2 does not require
the selection of this specific $\lambda$.
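
The role of $\lambda = 2(\sqrt{2}-1)$ can be checked numerically. The short Python sketch below is an illustration only; the variable names are hypothetical and the integral is approximated by a midpoint rule on a truncated range. It verifies that $\lambda e^{(Z^+)^2/4}$ has mean 1 when $Z \sim$ N(0,1), so that $e^{g(Z^+)}$ is a likelihood ratio, and that the weight on the point mass, $1 - \lambda/\sqrt{2}$, equals $\lambda/2$.

```python
import math

lam = 2.0 * (math.sqrt(2.0) - 1.0)

# Midpoint-rule approximation of the integral over (0, 40) of
# e^{z^2/4} * phi(z) = (2 pi)^{-1/2} e^{-z^2/4}; the exact value on (0, inf) is 1/sqrt(2).
h = 1e-3
integral = sum(
    math.exp(-((h * (i + 0.5)) ** 2) / 4.0) / math.sqrt(2.0 * math.pi) * h
    for i in range(40000)
)

# Under N(0,1), Z^+ = 0 with probability 1/2, so
# E[lam * exp((Z^+)^2 / 4)] = lam * (1/2 + integral).
mean_lr_term = lam * (0.5 + integral)

# Weight of the half-normal component HN(0,2) in the alternative.
w = lam / math.sqrt(2.0)
```

Here `mean_lr_term` is numerically 1 and `1 - w` coincides with `lam / 2`, which is the matching condition between the atom at zero and the density on $z > 0$.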

4. Numerical study. In addition to (3.1), Xie and Siegmund introduce
the stopping rule

(4.1)    $T_{LR}(p_0) = \inf\{t : \max_{s:\, k = t-s+1 \in \mathcal{K}} \sum_{n=1}^N (\mu_0 S_{nst} - k\mu_0^2/2 + \log p_0)^+ \ge b\}.$

This, like (3.1), is motivated by the most powerful likelihood ratio test,
but with $\mu$ substituted by a pre-determined $\mu_0$ rather than
$S^+_{nst}/k$. It bears resemblance to Mei's stopping rule

(4.2)    $T_{\rm Mei} = \inf\{t : \sum_{n=1}^N \max_{0 < s \le t} (\mu_0 S_{nst} - k\mu_0^2/2)^+ \ge b\},$

with the important difference of an additional $\log p_0$ term in (4.1) that
suppresses the contributions of low scoring data streams.

Test          1       3       5      10      30      50     100
max        25.5    18.1    15.5    12.6     9.6     8.6     7.2
XS(1)      52.3    18.7    12.2     6.7     3.0     2.3     2.0
XS(0.1)    31.6    14.2    10.4     6.7     3.5     2.8     2.0
LR(0.1)    29.1    13.4     9.8     7.1     4.6     4.0     3.4
LR(1)      82.0    27.2    15.5     6.8     3.0     2.3     2.0
Mei        53.2    23.0    15.7     9.6     4.9     3.8     3.0
---------------------------------------------------------------
Mei(0.1)   26.4    14.6    10.8     7.7     4.5     3.4     2.3
Mei(0.3)   34.3    15.9    11.8     7.6     4.1     3.1     2.0
S(0.1)     26.8    13.4     9.6     6.4     2.8     2.0     1.1
S(0.3)     32.6    14.0     9.5     5.6     2.3     1.5     1.0
s.e.        0.9     0.3     0.1     0.1     0.1     0.1     0.1

Table 2

Detection delays when $\#\mathcal{N}$ (out of N = 100) data streams undergo
distribution changes. Entries in the last row are standard error upper bounds.

Another key difference is that the sum lies outside the max in (4.2),
whereas in $T_{LR}$ (and $T_{XS}$, $T_S$) the sum lies inside the max. This
confers advantage to Mei's stopping rule when the change-point $\nu$ (or
$\nu_n$) differs across data streams. We investigate this in Section 5 where
we also propose an extension of Mei's stopping rule, denoted by
$T_{\rm Mei}(p_0)$, that like (4.1) weighs down the contributions from
non-signal data streams.
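
The effect of placing the sum outside rather than inside the max can be seen on a toy example with staggered change-points. The Python sketch below is an illustration only: the function names are hypothetical, $\log p_0$ is set to 0 so that the two statistics differ only in the order of the sum and the max, all window sizes are allowed, and $s$ is taken to run over $1, \ldots, t$.

```python
def window_scores(stream, mu0):
    """Scores mu0 * S_nst - k * mu0^2 / 2 for s = 1..t, with k = t - s + 1."""
    t = len(stream)
    out = []
    for s in range(1, t + 1):
        k = t - s + 1
        out.append(mu0 * sum(stream[s - 1:]) - k * mu0 ** 2 / 2.0)
    return out


def sum_of_max(x, mu0):
    """Sum over streams of the per-stream maximum over s (sum outside the max)."""
    return sum(max(max(window_scores(st, mu0)), 0.0) for st in x)


def max_of_sum(x, mu0):
    """Maximum over a common s of the summed positive parts (sum inside the max)."""
    per_stream = [window_scores(st, mu0) for st in x]
    t = len(x[0])
    return max(sum(max(sc[i], 0.0) for sc in per_stream) for i in range(t))


# Two streams whose elevated observations occur at different times.
x = [[2.0, 2.0, 0.0, 0.0], [0.0, 0.0, 2.0, 2.0]]
stat_outside = sum_of_max(x, 1.0)  # each stream uses its own best s
stat_inside = max_of_sum(x, 1.0)   # a single s is shared by all streams
```

Because each stream may pick its own starting point $s$, the statistic with the sum outside the max is never smaller than the one with the sum inside, and it is strictly larger here since the two streams favour different $s$.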

In our numerical study, we benchmark the detectability score stopping
rule against the above stopping rules and the max rule

(4.3)    $T_{\max} = \inf\{t : \max_{0 < s \le t} \max_{1 \le n \le N} (Z^+_{nst})^2/2 \ge b\}.$

As in [20], we select $N = 100$, $\mu = 1$ and $\#\mathcal{N}$ ranging from 1
to 100. The thresholds $b$ are calibrated to average run length 5000. The set
of window sizes chosen is $\mathcal{K} = \{1, \ldots, 200\}$, and for Mei's
stopping rule and $T_{LR}$ we select $\mu_0 = 1$.

We consider $p_0 = 0.1\,(= N^{-1/2})$ for the detectability score stopping
rule $T_S$, corresponding to the optimal choice under (3.5). Another
selection is $p_0 = 0.3\,\{\doteq [(\log\gamma)/N]^{1/2}\}$, which is optimal
under (2.3). It is interesting that in [20], the "optimal" $p_0 = N^{-1/2}$
is chosen for $T_{XS}$ and $T_{LR}$ in the numerical study.
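
The two choices of $p_0$ follow from the stated formulas by direct arithmetic. The Python sketch below, with hypothetical helper names and the constant $c$ taken to be 1 as an assumption, reproduces the rounded values 0.1 and 0.3 for $N = 100$ and $\gamma = 5000$.

```python
import math


def p0_slow(n_streams: int) -> float:
    """p0 = N^{-1/2}: the choice associated with (3.5), taking c = 1 (assumption)."""
    return n_streams ** -0.5


def p0_fast(n_streams: int, gamma: float) -> float:
    """p0 = [(log gamma) / N]^{1/2}: the choice associated with (2.3), taking c = 1."""
    return math.sqrt(math.log(gamma) / n_streams)
```

For $N = 100$ and $\gamma = 5000$ the second formula gives a value that rounds to 0.3, and for $N = 10^4$ the same formulas give values that round to 0.01 and 0.03.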

We conduct 500 Monte Carlo trials for the estimation of each average 
run length and detection delay. The thresholds for the stopping rules are in 
Table 1, the detection delays in Table 2. In Table 2 the simulation outcomes 
below the horizontal line are new, the outcomes above are reproduced from 
[20, Table 5]. 

We see that with a few understandable exceptions, the detectability score
stopping rules $T_S(0.1)$ and $T_S(0.3)$ have smaller detection delays
compared to their competitors over the full range of $\#\mathcal{N}$. This
justifies the application of the detectability score stopping rules for a
relatively small $N = 100$.

Test            1       10      10^2     10^3     10^4
max          32.7     18.6     13.9     11.1      9.4
Mei         246.5     46.7     12.0      4.0      1.0
Mei(0.01)    39.7     16.7  [illegible]  4.0      2.0
Mei(0.03)    53.7     18.6      9.0      4.0      2.0
S(0.01)      37.7     13.3      4.5      1.0      1.0
S(0.03)      49.3     13.7      3.9      1.0      1.0
s.e.          4.0      0.3      0.1      0.1      0.1

Table 3

Detection delays when $\#\mathcal{N}$ (out of N = 10^4) data streams undergo
distribution changes. Entries in the last row are standard error upper bounds.


Following the recommendation of a referee, we conduct a second numerical
exercise for a larger $N = 10^4$, with $\#\mathcal{N}$ ranging from 1 to
$10^4$. As in the earlier simulation study, we select $\mu = \mu_0 = 1$,
ARL = 5000 and $\mathcal{K} = \{1, \ldots, 200\}$. The detection thresholds
are in Table 1, the detection delays in Table 3. We see again that except for
$\#\mathcal{N} = 1$ when $T_{\max}$ is superior, the detectability score
stopping rules $T_S(p_0)$ for $p_0 = 0.01\,(= N^{-1/2})$ and
$0.03\,\{\doteq [(\log\gamma)/N]^{1/2}\}$ have the smallest detection delays.

5. Detectability of Mei's stopping rule. As mentioned earlier, there
is no implicit assumption that the distribution changes occur simultaneously
when applying Mei's stopping rule (4.2). Another advantage is the efficient
recursive computation of the stopping rule. However this recursive
computation comes with the price of information loss. In this section we
improve Mei's stopping rule by applying a detectability score transformation
on each CUSUM score. Due to the information loss, optimality is possible
only for specific $\mu_0$.

Let $R_{nt}$ be the CUSUM score of the $n$th detector at time $t$, satisfying

(5.1)    $R_{n0} = 0$, $R_{nt} = (R_{n,t-1} + \mu_0 X_{nt} - \mu_0^2/2)^+$, $t \ge 1$.

Define

(5.2)    $T_{\rm Mei}(p_0) = \inf\{t : \sum_{n=1}^N g_M(R_{nt}) \ge b\},$

with the detectability score transformation

(5.3)    $g_M(x) = \log[1 + p_0(\lambda_M e^{x/2} - 1)]$, $\lambda_M > 0$.
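
The recursion (5.1) and the transformation (5.3) translate directly into an online procedure that stores only one score per data stream. The following Python sketch is an illustration under assumptions, not the author's code: the function names are hypothetical and the threshold `b` and constant `lam_m` are supplied by the user.

```python
import math


def cusum_update(r_prev: float, x: float, mu0: float) -> float:
    """One step of (5.1): R_nt = (R_{n,t-1} + mu0 * X_nt - mu0^2 / 2)^+."""
    return max(r_prev + mu0 * x - mu0 * mu0 / 2.0, 0.0)


def g_mei(r: float, p0: float, lam_m: float) -> float:
    """Transformation (5.3): g_M(x) = log[1 + p0 (lam_m e^{x/2} - 1)]."""
    return math.log1p(p0 * (lam_m * math.exp(r / 2.0) - 1.0))


def extended_mei_stopping_time(rows, mu0, p0, lam_m, b):
    """Return the first t with sum_n g_M(R_nt) >= b, or None if never reached.

    rows is an iterable of length-N observation vectors, one per time step.
    """
    scores = None
    for t, row in enumerate(rows, start=1):
        if scores is None:
            scores = [0.0] * len(row)  # R_n0 = 0
        scores = [cusum_update(r, x, mu0) for r, x in zip(scores, row)]
        if sum(g_mei(r, p0, lam_m) for r in scores) >= b:
            return t
    return None
```

With `p0 = 1` the transformed score is $\log\lambda_M + R_{nt}/2$, an affine function of the CUSUM score, so the rule stops at the same times as a rule based on the plain sum of CUSUM scores with a rescaled threshold.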

This is an extension of Mei's test, for $T_{\rm Mei}(1)$ is equivalent to $T_{\rm Mei}$. Let
^ = limt^oo and define 

$D_{N,k}(T) = \sup_{k \le \nu < \infty} E_\nu(T - \nu + 1 \mid T \ge \nu).$

$\mu$     Mei    Mei(0.1)    Mei(0.3)    S(0.1)    S(0.3)
0.5      20.7      21.2        20.6       23.0      20.7
0.7      15.5      15.4        15.1       16.0      14.9
1.0      11.9      10.9        11.1       10.9      10.4
1.3      10.0       8.7         9.0        8.0       7.9

Table 4

Detection delays for staggered distribution changes. The standard errors are not more
than 0.2.


Theorem 3. Consider stopping rule TueiiPo), 0 < po < 1- u = 
log[l+po(AM^ —1)]■ -(f ARL(TMei(po)) = 7: then threshold b < Nu+log{4j). 
In addition, if {2.3), (2.4) hold andpo = c[(log 7 )/A^]^/^ for some c> 0, then 
the following hold as N ^ oo. 

(a) If $\frac{1-\zeta}{2} < \beta < \frac{3(1-\zeta)}{4}$ and $\mu_0 = 2\mu$, then

(5.4)    $\frac{D_{N,K_N}(T_{\rm Mei}(p_0))}{\log N} \to 2\mu^{-2}\rho(\beta,\zeta),$

for $K_N = 2\mu^{-2}(1 - \zeta - \beta)\log N$.

(b) If < /? < 1 — C and po = then (5.4) holds for = 

2li"V(/3,C) logiV. 


Remarks. 1. In Theorem 3 "optimality" occurring when $\mu_0 > \mu$ is a
consequence of a small subset of $\mathcal{N}$ dominating the score
contributions, after the detectability score transformations have been
applied.

2. Notice the weaker (5.4) instead of (3.4). The extra initial delay is needed
for the CUSUM scores $R_{nt}$ for $n \notin \mathcal{N}$ to reach their
stationary values and not pull down the total score. In that sense the
detection delay criterion may be disadvantageous to the extended Mei's
stopping rule (and hence Mei's stopping rule itself) since in practice we
seldom expect the change-point $\nu$ to be that close to 0.


To highlight the unique characteristics of the extended Mei's stopping
rule (5.2) in dealing with staggered change-points, we conduct a numerical
study with $\mu_{nt} = \mu I_{\{t \ge n\}}$ in place of (2.1). That is, the
$n$th data stream undergoes a distribution change at time $n$. As in
Section 4 the stopping rules are calibrated to average run length of 5000,
for $N = 100$ detectors, and with $\mu_0 = 1$. The thresholds $b$ for
$T_{\rm Mei}(p_0)$ are in Table 1 (Section 4), the detection delays in
Table 4. We select $\lambda_M = 0.64$; this will be explained later. By
detection delay we shall mean the expected stopping time when
$\mu_{nt} = \mu I_{\{t \ge n\}}$.

We see from Tables 2 (Section 4) and 4 that $T_{\rm Mei}(0.1)$ and
$T_{\rm Mei}(0.3)$ have smaller detection delays compared to $T_{\rm Mei}$,
almost uniformly over $\#\mathcal{N}$ and $\mu$. In Table 3 (for
$N = 10^4$), $T_{\rm Mei}(0.01)$ and $T_{\rm Mei}(0.03)$ are superior to
$T_{\rm Mei}$ for $\#\mathcal{N} \le 100$. Hence applying detectability score
transformations on the CUSUM scores improves Mei's stopping rule in general;
the noise suppression on data streams that do not undergo distribution change
is indeed effective. In Table 4 we see that in general $T_S(p_0)$ performs
better than $T_{\rm Mei}(p_0)$ when $\mu > 1$ but the reverse is true when
$\mu < 1$. This is consistent with the prediction in Theorem 3 of
$T_{\rm Mei}(p_0)$ performing better for $\mu < \mu_0$.

We end this section with explanations of the choice of the detectability
score transformation (5.3) and choice of $\lambda_M$. It follows from renewal
theory, see for example Siegmund [17, eq. 8.49], that

(5.5)    $\lim_{t \to \infty} P_\infty\{R_{nt} \ge x\} \sim a e^{-x}$ as $x \to \infty$,

for $a = 2\mu_0^{-2}\exp[-2\sum_{j=1}^{\infty} j^{-1}\Phi(-\mu_0\sqrt{j}/2)]$.
Therefore the tails of $R_{nt}$ under $P_\infty$ are like that of an i.i.d.
sample from $G_1 := (1 - a)\delta_0 + a\,$Exp(1), where $\delta_0$ denotes a
point mass at 0 and Exp($\theta$) the exponential distribution with mean
$\theta$.

For large $x$ (smaller than $\log N$) and $t$, $\#\{n : R_{nt} \ge x\}/N$ is
asymptotically normal with mean $ae^{-x}$ and variance $ae^{-x}/N$. Hence the
minimum detectable difference of $P\{R_{nt} \ge x\}$ is $\sqrt{ae^{-x}/N}$.
The distribution at the limit of detectability is therefore
$G_* := (1 - p_0)G_1 + p_0 G_2$, where
$G_2 = (1 - \omega)\delta_0 + \omega\,$Exp(2) for some $0 < \omega < 1$, and
$p_0$ is of order $N^{-1/2}$. The detectability score transformation $g_M$
[see (5.3)], with $\lambda_M = \frac{\omega}{2a}$ (= 0.64 for $\mu_0 = 1$),
is the log likelihood ratio between $G_*$ and $G_1$, with $\omega$ selected
so that $g_M$ is continuous at 0. We emphasize however that this is for
convenience; optimality in Theorem 3 is not restricted to this choice of
$\lambda_M$.
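
The value $\lambda_M = 0.64$ for $\mu_0 = 1$ can be reproduced numerically. The Python sketch below is a check under stated assumptions rather than the author's computation: the infinite series in $a$ is truncated after a finite number of terms, the function names are hypothetical, and continuity at 0 is interpreted as requiring the likelihood ratio of the atoms at zero, $(1-\omega)/(1-a)$, to equal the limit $\omega/(2a)$ of the density ratio as $x \downarrow 0$. Under that reading $\omega = 2a/(1+a)$ and $\lambda_M = 1/(1+a)$.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def tail_constant(mu0: float, terms: int = 2000) -> float:
    """a = 2 mu0^-2 exp[-2 sum_{j>=1} Phi(-mu0 sqrt(j)/2) / j], series truncated."""
    s = sum(norm_cdf(-mu0 * math.sqrt(j) / 2.0) / j for j in range(1, terms + 1))
    return 2.0 / mu0 ** 2 * math.exp(-2.0 * s)


a = tail_constant(1.0)
omega = 2.0 * a / (1.0 + a)   # solves (1 - omega)/(1 - a) = omega/(2a)
lam_m = omega / (2.0 * a)     # equals 1/(1 + a)
```

For $\mu_0 = 1$ this gives $a$ close to 0.56 and `lam_m` close to 0.64, in agreement with the value used in the numerical study.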

6. Proof of Theorem 1. To help the reader, we summarize below the
definitions of the probability measures used in the proofs of Theorems 1-3
in this and the next two sections.

1. $P_s$ ($E_s$): This is the probability measure (expectation) under which an
arbitrarily chosen data stream has probability $(1 - p)$ that all
observations are (i.i.d.) N(0,1), and probability $p$ that observations are
N(0,1) before time $s$, N($\mu$,1) at and after time $s$. In particular, if

(a) $s = \infty$, then with probability 1 all observations are N(0,1).

(b) $s = 1$, then an arbitrarily chosen data stream has probability
$(1 - p)$ that all observations are N(0,1), and probability $p$ that all
observations are N($\mu$,1).

2. $P$ ($E$): This is the probability measure (expectation) under which
$Y, Y_1, Y_2, \ldots$ are i.i.d. N(0,1) random variables.

We preface the proof of Theorem 1 with the following lemmas. Lemma 1
is well-known, see for example (3.3) of Lai [10].

Lemma 1. Let $k \ge 1$. If $T$ is a stopping rule such that
$E_\infty T \ge \gamma$, then
$P_\infty\{T \ge s + k \mid T \ge s\} \ge 1 - k/\gamma$ for some $s \ge 1$.


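Recall the sum $S_{nst} = \sum_{i=s}^t X_{ni}$ and the log likelihood ratio

$\ell_{\cdot st} = \sum_{n=1}^N \ell_{nst}$, where $\ell_{nst} = \log(1 - p + p e^{\mu S_{nst} - k\mu^2/2})$, $k = t - s + 1$.

Lemma 2. If we can find $b (= b_N)$ and $k (= k_N)$ such that

(6.1)    $P_\infty\{\ell_{\cdot 1k} \ge b\}\,(= P_\infty\{\ell_{\cdot st} \ge b\}) \ge k/\gamma,$

(6.2)    $P_1\{\ell_{\cdot 1k} \ge b\}\,(= P_s\{\ell_{\cdot st} \ge b\}) \to 0,$

then $D_N(T) \ge [1 + o(1)]k$ for any stopping rule $T$ satisfying
$E_\infty T \ge \gamma$.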

Proof. Let $T$ satisfy $E_\infty T \ge \gamma$, and let $b$, $k$ satisfy (6.1)
and (6.2). By Lemma 1 we can find $s$ satisfying

(6.3)    $P_\infty\{T \ge s + k \mid T \ge s\} \ge 1 - k/\gamma.$

Let $P^*_\infty\{\cdot\} = P_\infty\{\cdot \mid T \ge s\}$ and
$P^*_s\{\cdot\} = P_s\{\cdot \mid T \ge s\}$.

Let $t = s + k - 1$, and consider the test, conditioned on $T \ge s$, of

$H_0$: $X_{nu} \sim$ N(0,1) for $1 \le n \le N$, $1 \le u \le t$,
vs $H_s$: $X_{nu} \sim$ N($\mu I_{\{u \ge s,\, n \in \mathcal{N}\}}$, 1) for $1 \le n \le N$, $1 \le u \le t$,
with $I_{\{n \in \mathcal{N}\}}$ i.i.d. Bernoulli($p$).

By (6.3) the test "reject $H_0$ if $T < s + k$, accept $H_0$ otherwise" has
Type I error probability not exceeding $k/\gamma$. By (6.1) the likelihood
ratio test rejecting $H_0$ when $\ell_{\cdot st}$ exceeds $b$ has Type I
error probability at least $k/\gamma$, and hence by the Neyman-Pearson Lemma,
it is at least as powerful as the test based on $T$. That is

(6.4)    $P^*_s\{\ell_{\cdot st} \ge b\} \ge P^*_s\{T < s + k\}.$

A key observation here is that the conditioning on $\{T \ge s\}$ does not
affect the distribution of $X_{nu}$ for $u \ge s$ under either $H_0$ or
$H_s$. Therefore by (6.4),

$D_N(T) \ge E_s(T - s + 1 \mid T \ge s) \ge k P^*_s\{T \ge s + k\} \ge k P^*_s\{\ell_{\cdot st} < b\} = k P_s\{\ell_{\cdot st} < b\},$

and we conclude $D_N(T) \ge [1 + o(1)]k$ from (6.2). □

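Lemma 3. If $k$ is such that $\log k = o(N^{\zeta})$ and

(6.5)    $P_1\{\ell_{\cdot 1k} \ge 2N^{\zeta}/3\} \to 0,$

then (6.1) and (6.2) follow from selecting $b$ satisfying

(6.6)    $P_1\{2N^{\zeta}/3 > \ell_{\cdot 1k} \ge b\} = \exp(-N^{\zeta}/4).$

Proof. It follows from (6.5) and (6.6) that (6.2) holds. Moreover since
$\ell_{\cdot 1k}$ is the log change of measure between $P_1$ and $P_\infty$
at time $k$,

$P_\infty\{\ell_{\cdot 1k} \ge b\} \ge P_\infty\{2N^{\zeta}/3 > \ell_{\cdot 1k} \ge b\}$

$= E_1(e^{-\ell_{\cdot 1k}} I_{\{2N^{\zeta}/3 > \ell_{\cdot 1k} \ge b\}}) \ge \exp(-2N^{\zeta}/3) P_1\{2N^{\zeta}/3 > \ell_{\cdot 1k} \ge b\},$

and (6.1) follows from (6.6) since $\log(\gamma/k) \sim N^{\zeta}$. □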

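In view of Lemmas 2 and 3, to prove Theorem 1 it suffices to check (6.5)
for

(6.7)    $k = \lfloor (1 - \delta) 2\mu^{-2}\rho(\beta,\zeta)\log N \rfloor$ if $\frac{1-\zeta}{2} < \beta < 1 - \zeta$,
         $k = \lfloor N^{\beta+\zeta-1-\delta} \rfloor$ if $\beta > 1 - \zeta$,

with $\delta > 0$ small. Motivations behind the above choices of $k$ are
given in Appendix A.

Let $Z_{nk} = S_{n1k}/\sqrt{k}$ and

(6.8)    $\ell_{nk} (= \ell_{n1k}) = \log(1 - p + p e^{Z_{nk}\mu\sqrt{k} - k\mu^2/2}).$

Note that $Z_{nk}$, $1 \le n \le N$, are i.i.d. N(0,1) under $P_\infty$, and
i.i.d. $(1 - p)$N(0,1) $+\, p$N($\mu\sqrt{k}$,1) under $P_1$. More
specifically, $Z_{nk}$ has the distribution of $Y \sim$ N(0,1) if
$n \notin \mathcal{N}$, and the distribution of $Y + \mu\sqrt{k}$ if
$n \in \mathcal{N}$. Hence conditioned on $n \notin \mathcal{N}$,
$\ell_{nk}$ has the distribution of

(6.9)    $\ell_0 = \log(1 - p + p e^{Y\mu\sqrt{k} - k\mu^2/2}),$

whereas conditioned on $n \in \mathcal{N}$, $\ell_{nk}$ has the distribution of

(6.10)    $\ell_1 = \log(1 - p + p e^{Y\mu\sqrt{k} + k\mu^2/2}).$

Case 1: $\frac{1-\zeta}{2} < \beta < 1 - \zeta$. Let
$\tilde\ell_{nk} = \ell_{nk} I_{\{Z_{nk} \le \omega_N\}}$, where

$\omega_N = \sqrt{2(1 - \zeta)\log N + 2\log\log N}.$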


We shall check on two sub-cases that 


( 6 . 11 ) 

( 6 . 12 ) 

(6.13) 

(6.14) 




sup 

l<n<N 


P+ 

^nk 


El 


^nk 


Pl{Znk > Wat} 


o(iV^-i), 

0 ( 1 ), 

o{N'^~^/log N). 


Note that by (6.14) and niaxi<„< 7 v Z^k = Op{yJ\og N), 


N 

(6.15) ^ inkl{Z^,>.^} = Op{N‘^/V^). 

n=l 


Recall that £,ik = J2n=i^nk and let i,ik = J2n=i^nk- By Chebyshev’s 
inequality and (6.13), 

(6.16) Pi{I,ik -NJ1> A^02} < N-<Ei(i,ik - NJif 

= N-^+^Ei{lr,k - n? < ^ 0 . 

By (6.15), noting that 4ifc - Lik = Y.n=i^nk\z„k>^N}^ 

(6.17) Pi{Uu - I.ik > N^/^/WN] ^ 0. 

It follows from (6.16) and (6.17) that > 6} —)• 0 for 6 = NJi + N^P + 

N'^/^\ogN[= o{N‘^) by (6.11)], hence (6.5) holds. 


Checking (6.11)-(6.14): 

(a) L^ < (3 < and p{^,C) = (3 — By Jensen’s inequality, 

E£o < logi?e^° = 0, therefore to show (6.11), it suffices to show that 

(6.18) pEit = o{N^-^). 

Indeed as log(l -|- x) < x, by (6.7), 

(6.19) pEit < ^ ^2gV ^ ^^^_2/3+(l-5)(2/3-l-hC)^^ 

and (6.18) holds. 


To show (6.12), note that 

(6.20) sup = ^ga;^/2-(a;Jv-M^/fc)V2 

l<n<A^ 

~ log N. 

Express (3 = -|- a(l — (") for some 0 < a < |. Since p{f3,C) = ce(l ~ C) 

and wn > \/2{l — C) log N, by (6.7) there exists e > 0 small such that 


( 6 . 21 ) 


2 log N 


> 


> 


(1 “ 0(1 “ \/o)^ + e 
(l_C)[(i^%^ + l-a] + e 
(l-C)(5-a) + e 
l-C-/3 + e. 


Substituting (6.21) into (6.20) shows (6.12). 

To show (6.13), note that by (6.19), 

(6.22) E£l = o(p2e2>^MvT-fc/.2^ ^ = o{N^-^). 

Since /3 > 

(6.23) (J-^f = O(p 0 = o(iV^-0, 

and (6.13) follows from (6.12), (6.18) and (6.22). 

Finally to show (6.14), note that P{Y > ujn} = o{N^~^/ log N), and that 
by (6.21), 

(6.24) pP{Y + pVk > un] = 0(iV-/3e-fo^-MVfc)V2) 

= o(A^^“^/log A^). 

(b) < /3 < 1 — C and p{x,y) = {x — y)^, where x = y^l — (, 

y = P- By log(l + v) <v, 

J —oo 

= p2eV$(a;jv-2/rOfc). 

Since ojn ~ Xy/2 log N, pVk = (1 — 6){x — y)yJ2 log N + 0(1) and x > 2y, 
it follows that oj^ < 2py/k for <5 > 0 small, and therefore 

(6.26) -2pVk) = 0(p2gfc/.2-(<^iv-2M^/fc)V2) 

— Q('p2ga;^/2-(i^jv-/2\/fc)2^ 

$= O(N^{-2\beta + x^2 - 2y^2 - \epsilon}) = o(N^{\zeta-1})$

for some $\epsilon > 0$, since $-2\beta + x^2 - 2y^2 = \zeta - 1$. And since
$E\ell_0 \le 0$, (6.11) follows from (6.25) and (6.26).

By the first line of (6.20), 

ink < 

for some e > 0, therefore (6.12) holds. 

Note that by (6.9) and log(l + v)<v, 

E(f§I(y<„„)) = 0{p^ 

= 0{p^e^y‘^^{u]N - “^[ky/k)), 

and (6.13) follows from (6.12), (6.23), (6.25) and (6.26). It is easy to check 
that (6.24), and hence (6.14), holds in this sub-case. 

Case 2: /3 > 1 — C and k = . Let 

^ r if n 0 AA, 

[ink ifnCAA. 

Let i,ik = J2n=iink- In place of (6.11) and (6.13), we shall check that for 
5 > 0 small and N large, 

(6.27) Eilnk < N‘^-^/2, 

(6.28) = o(iV2f-i). 

Note that in place of (6.15), we have 

(6.29) Pi{Znk > \/2\ogN for some n ^ M}{= Pi{i,ik > i,ik}) 0. 

It follows from (6.27), (6.28) and Chebyshev’s inequality, see the arguments 
in (6.16), that Pi{i,ik > 2A^^/3} —)• 0, hence (6.5) follows from (6.29). 
Check that 

(6.30) log(l-|-e^) < log2-|-x"*", 
and apply it on (6.10) to show that 

pEii < |?[log 2 -|- i3(logp -|- YpVk + kp"^/2)^] ~ /2. 

Since EIq < 0, (6.27) holds when 5 < 

Since sup„^_yy/ \ink\ ~P, by (6.30), 

< pE(}ogp + YpVk + kfiy2)^ Yaip"^) 

= 0(iV^+2f-2)+0(/V-2/3), 

and (6.28) holds because $\beta < 1$.

7. Proof of Theorem 2. The following lemma provides an upper
bound for the threshold of the detectability score stopping rule.

Lemma 4. Consider stopping rule $T_S(p_0)$, $0 < p_0 \le 1$, with arbitrary
window sizes $\mathcal{K}$. If $b = \log(4\gamma^2 + 2\gamma)$, then
$E_\infty T_S(p_0) \ge \gamma$.

Proof. It suffices to show that

(7.1)    $P_\infty\{T_S(p_0) < 2\gamma\} \le \frac{1}{2}.$

Let $Z_{nst} = S_{nst}/\sqrt{k}$, $k = t - s + 1$. Since
$V_{st} := \sum_{n=1}^N g(Z^+_{nst})$ is a log likelihood ratio against
$Z_{1st}, \ldots, Z_{Nst}$ i.i.d. N(0,1), it follows from a change of measure
argument that

$P_\infty\{V_{st} \ge b\} \le e^{-b} = (4\gamma^2 + 2\gamma)^{-1}.$

By Bonferroni's inequality,

$P_\infty\{T_S < 2\gamma\} \le \sum_{(s,t):\, 1 \le s \le t < 2\gamma} P_\infty\{V_{st} \ge b\} \le \binom{2\gamma + 1}{2}(4\gamma^2 + 2\gamma)^{-1},$

and (7.1) follows. □
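
The threshold bound of Lemma 4 is easy to evaluate. The Python sketch below, with hypothetical function names, computes $b = \log(4\gamma^2 + 2\gamma)$ and the Bonferroni bound in the proof, where the number of pairs $(s,t)$ is bounded by $\binom{2\gamma+1}{2}$ and $\gamma$ is taken to be an integer for simplicity.

```python
import math


def threshold_upper_bound(gamma: float) -> float:
    """b = log(4 gamma^2 + 2 gamma), the threshold used in Lemma 4."""
    return math.log(4.0 * gamma ** 2 + 2.0 * gamma)


def bonferroni_bound(gamma: int) -> float:
    """C(2 gamma + 1, 2) * e^{-b}: the bound on P_inf{T_S < 2 gamma} in the proof."""
    return math.comb(2 * gamma + 1, 2) * math.exp(-threshold_upper_bound(gamma))
```

For $\gamma = 5000$ the threshold bound rounds to 18.42, the bracketed value reported for the detectability score rules in Table 1, and the Bonferroni bound equals 1/2, so that $E_\infty T_S(p_0) \ge 2\gamma P_\infty\{T_S(p_0) \ge 2\gamma\} \ge \gamma$.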

Assume (2.3), (2.4) and let 7 = rninmeJjv-Pi{En=i ^(^nifc) ^ ^1#-^ = 
m}, where 

(7.2) = {m : |m - Np\ < 

By the Chernoff-Hoeffding’s inequality, 

(7.3) Pi{#A7 ^ Jm} < exp(-2ArC) = 0 ( 7 "^). 

We shall show in various cases below that 7 —)■ 1 when 

f 1 

(7.4) k=l [(l + 727-V(/3,C)logiVj ifY</3<l-C, 

[ MNP+^-^ if /3 > 1 - C, 

for all (5 > 0, and M large. For j > 1 and m e Jn, Pi{Ts{po) > jk-\-l\^J\f = 
m} < (1 — 7)L Hence by (7.3), 

00 

(7.5) Dn{Ts{po)) VV + 7A{#A7 0 Jn} - k, 

j=0 

and the proof of Theorem 2 is complete. 

Let $V_N = \sum_{n=1}^N g(Y_n)$ for $Y_1, \ldots, Y_N$ i.i.d. N(0,1).
Lemma 5. If $p_0 \sim cN^{(\zeta-1)/2}$, then $P\{V_N \ge -N^{\zeta}\} \to 1$.

Proof. Let l>( 2 :) = 4>{y)dy where 4>{y) = g{z) = 

5 ( 2 )I{ 2 <to^} where wn = ^2(1 - C) log N - (log log N)/2. By log(l+x) ~ x 
and poc^n/'^ 0, 

/ '^N o 

(Ae^+/^ - l)(j){z)dz 

-OO 

= civK-‘l/2 [I + A\/2 y™ - •fiwn) 

= - $(u.„)] + AV2|i - i(^)]}. 

Since A = 2{y/2 — 1) solves i + :^ = 1 s-iid ^{wn) < 
by (7.6), 

(7.7) \Em)\=o{N<-^). 

Since 

/ W]\f j 

(Ae^+/^ - lf<i){z)dz = 0{N^-^^/kfiN), 

-OO 

and $g \ge \tilde g$, we conclude Lemma 5 from (7.7) and Chebyshev’s inequality. □

Let $h(z) = g((z + \mu\sqrt{k})^+) - g(z^+)\ (\ge 0)$ and $H_N = \sum_{n \in \mathcal{N}} h(Y_n)$. Then

$\eta = \min_{m \in J_N} P\{V_N + H_N \ge b \mid \#\mathcal{N} = m\}.$

In view of Lemmas 4 and 5, to show $\eta \to 1$ and hence (7.5), it suffices to
show that

(7.8) $\min_{m \in J_N} P\{H_N \ge 4N^{\zeta} \mid \#\mathcal{N} = m\} \to 1.$

We shall check (7.8) in three cases below. Note that $\#\mathcal{N} \in J_N$ implies
$\#\mathcal{N} \sim N^{1-\beta}$. For notational simplicity, we shall let $C$ denote a generic
positive constant.

Case 0: $\beta < \frac{1-\zeta}{2}$ and $k = 1$. Since $\log(1 + x) \sim x$ as $x \to 0$,

$h(z) \sim c\lambda N^{(\zeta-1)/2}(e^{(z+\mu)^2/4} - e^{z^2/4}) \ge c\lambda N^{(\zeta-1)/2}(e^{\mu^2/4} - 1)$

uniformly over $0 \le z \le 1$. Hence by LLN,

$H_N \ge [C + o_p(1)]N^{1-\beta+(\zeta-1)/2},$

and (7.8) holds because $1 - \beta + \frac{\zeta-1}{2} > \zeta$.

Case 1: $\frac{1-\zeta}{2} < \beta < 1 - \zeta$ and $k = \lfloor (1+\delta)2\mu^{-2}\rho(\beta,\zeta)\log N \rfloor$. We shall
show (7.8) for the following sub-cases.

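(a) $\frac{1-\zeta}{2} < \beta \le \frac{3(1-\zeta)}{4}$ and $\rho(\beta,\zeta) = \beta - \frac{1-\zeta}{2}$. For $\delta > 0$ small and $z \ge \mu\sqrt{k}$,

(7.9) $h(z) \sim \log[1 + cN^{(\zeta-1)/2}(\lambda e^{(z+\mu\sqrt{k})^2/4} - 1)] - \log[1 + cN^{(\zeta-1)/2}(\lambda e^{z^2/4} - 1)] \ge [c\lambda + o(1)]N^{(\zeta-1)/2}e^{\mu^2 k}.$

Since $P\{Y_n \ge \mu\sqrt{k}\} \ge Ce^{-\mu^2 k/2}/\sqrt{\log N}$, by (7.9) and LLN,

$H_N \ge [C + o_p(1)]N^{1-\beta+(\zeta-1)/2}e^{\mu^2 k/2}/\sqrt{\log N} \ge [C + o_p(1)]N^{1-\beta+(\zeta-1)/2+\rho(\beta,\zeta)+\epsilon}/\sqrt{\log N}$

for some $\epsilon > 0$, and (7.8) holds because $1 - \beta + \frac{\zeta-1}{2} + \rho(\beta,\zeta) = \zeta$.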

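(b) $\frac{3(1-\zeta)}{4} < \beta < 1 - \zeta$ and $\rho(\beta,\zeta) = (x - y)^2$, where $x = \sqrt{1-\zeta}$,
$y = \sqrt{1-\zeta-\beta}$. Since $\mu^2 k = 2(1+\delta)(x-y)^2\log N + O(1)$, by the first
relation in (7.9), $h(z) \ge C$ for $z \ge \sqrt{2(y^2-\epsilon)\log N}$ with $\epsilon > 0$ small. Since
$P\{Y_n \ge \sqrt{2(y^2-\epsilon)\log N}\} \ge CN^{-y^2+\epsilon}/\sqrt{\log N}$, by LLN,

$H_N \ge [C + o_p(1)]N^{1-\beta-y^2+\epsilon}/\sqrt{\log N},$

and (7.8) holds because $1 - \beta - y^2 = \zeta$.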

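Case 2: $\beta > 1 - \zeta$ and $k = \lfloor MN^{\beta+\zeta-1} \rfloor$. By the first relation in (7.9), for
$z \ge 0$,

$h(z) \ge \frac{\mu^2 k}{4} + O(\log N) = [\frac{M\mu^2}{4} + o(1)]N^{\beta+\zeta-1}.$

By LLN, $H_N \ge [\frac{M\mu^2}{8} + o_p(1)]N^{\zeta}$, and (7.8) holds for $M > 32\mu^{-2}$.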
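
The three detection domains and their window sizes can be tabulated directly. The sketch below is an illustration under stated assumptions: it takes $\rho(\beta,\zeta) = \beta - (1-\zeta)/2$ up to $\beta = 3(1-\zeta)/4$ and $\rho(\beta,\zeta) = (\sqrt{1-\zeta} - \sqrt{1-\zeta-\beta})^2$ beyond it, as in sub-cases (a) and (b), and the names `rho`, `window_size`, `delta` and `big_m` are hypothetical.

```python
import math

def rho(beta, zeta):
    """Assumed asymptotic constant rho(beta, zeta) on (1-zeta)/2 < beta < 1-zeta."""
    if not (1 - zeta) / 2 < beta < 1 - zeta:
        raise ValueError("rho is only used for (1-zeta)/2 < beta < 1-zeta")
    if beta <= 3 * (1 - zeta) / 4:
        return beta - (1 - zeta) / 2          # sub-case (a)
    x = math.sqrt(1 - zeta)
    y = math.sqrt(1 - zeta - beta)
    return (x - y) ** 2                       # sub-case (b)

def window_size(beta, zeta, mu, n_streams, delta=0.1, big_m=None):
    """Window size k for the three domains (boundary values of beta are not handled)."""
    if beta < (1 - zeta) / 2:
        return 1                              # Case 0: immediate detection
    if beta < 1 - zeta:
        # Case 1: logarithmic growth in the number of streams
        return math.floor((1 + delta) * 2 * rho(beta, zeta) * math.log(n_streams) / mu**2)
    if big_m is None:
        big_m = 33 / mu**2                    # any constant M > 32 / mu^2
    # Case 2: polynomial growth in the number of streams
    return math.floor(big_m * n_streams ** (beta + zeta - 1))
```

The two branches of `rho` agree at $\beta = 3(1-\zeta)/4$, where both equal $(1-\zeta)/4$, so the window size varies continuously across sub-cases (a) and (b).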

8. Proof of Theorem 3. In Lemma 6 below we provide an upper
bound for the detection threshold of the extended Mei’s stopping rule, and
follow this with conditions under which this bound is exceeded under $P_\nu$.
We complete the proof by checking these conditions for various cases. Let

50(t) = yM(a;) - where yM(a;) = log[l-Lyo(AMe'^^^ - 1)], 

u = log[l-Lyo(AMC - 1)], 

and ^ = Ihut^oo 
Lemma 6. Consider the stopping rule $T_{\rm Mei}(p_0)$, $0 < p_0 < 1$. If the threshold
$b = Nu + \log(4\gamma)$, then $E_\infty T_{\rm Mei}(p_0) \ge \gamma$.


Proof. If $b = Nu + \log(4\gamma)$, then

$T_{\rm Mei}(p_0) = \inf\{t : \sum_{n=1}^N g_0(R_{nt}) \ge \log(4\gamma)\}.$


Let $S_j = \sum_{i=1}^j Y_i$ with $Y_i$ i.i.d. N(0,1), and let

$R = \sup_{j \ge 0}(\mu_0 S_j - j\mu_0^2/2).$

Let Ri,..., Rn be an i.i.d. sample with the distribution of R. Let 1 < t < 27 . 
Since Rnt is bounded stochastically by Rn, it follows from = 1, a 

change of measure argument and go monotone that 

$P_\infty\{\sum_{n=1}^N g_0(R_{nt}) \ge \log(4\gamma)\} \le P\{\sum_{n=1}^N g_0(R_n) \ge \log(4\gamma)\} \le (4\gamma)^{-1}.$


Therefore $\{T_{\rm Mei}(p_0) < 2\gamma\}$ is a union of no more than $2\gamma$ events, each with
probability bounded by $(4\gamma)^{-1}$ under $P_\infty$. We conclude that $P_\infty\{T_{\rm Mei}(p_0) <
2\gamma\} \le \tfrac12$. Hence $E_\infty T_{\rm Mei}(p_0) \ge \gamma$. □


Let n > Kn and t = + A: — 1, where k = [(1 + 5)2/r p{f3X) N\ 

for $\delta > 0$ small. Let $U_n = \mu_0\sum_{i=\nu}^{t} X_{ni} - k\mu_0^2/2\ (\le R_{nt})$. Under $P_\nu$, $U_n \sim
N(k\mu_0(\mu - \mu_0/2), k\mu_0^2)$ when $n \in \mathcal{N}$. Theorem 3 follows from


( 8 . 1 ) 

( 8 . 2 ) 


inf Pn\ X 9o{Rnt) > #Af = m] 1 , 

mejjv J 




inf PJ X 5 o(U+) > #N = m] ^ 1 , 


n&AT 


for some $\epsilon > 0$, with $m \sim N^{1-\beta}$ uniformly over $m \in J_N$, see (7.2).

The following lemma provides the framework for showing (8.1) and (8.2).
Let $\tilde g_0(y) = g_0(y)\mathbf{1}_{\{y \le v_N\}}$, where

(8.3) $v_N = (1 - \zeta)\log N - \log\log N.$

Lemma 7. If $t \ge 4\mu_0^{-2}(1-\zeta)\log N$ and $n \notin \mathcal{N}$, then for all $\epsilon > 0$,

(8.4) = C + o(1V(^-L/2+^). 
Moreover if $p_0 \sim cN^{(\zeta-1)/2}$ for some $c > 0$, then

(8.5) $[\inf_{y \ge 0}\tilde g_0(y)]^2 = O(N^{\zeta-1}),$

(8.6) $\sup_{y \ge 0}\tilde g_0(y) = O(1).$

Proof. The relation (8.5) follows from

$|\inf_{y \ge 0}\tilde g_0(y)| = |g_0(0)| = O(p_0) = O(N^{(\zeta-1)/2}),$

whereas (8.6) follows from $\sup_{y \ge 0}\tilde g_0(y) = g_0(v_N) = O(1)$.
By (5.1), we can express

$R_{nt} = \sup_{1 \le s \le t}[\mu_0 S_{nst} - (t - s + 1)\mu_0^2/2]^+.$

Extend $\{X_{nu} : u \ge 1\}$ to $\{X_{nu} : -\infty < u < \infty\}$ by letting $X_{nu}$ be i.i.d. N(0,1)
under $P_\infty$ for $u \le 0$. Fix $t$ and let

$R_n^* = \sup_{-\infty < s \le t}[\mu_0 S_{nst} - (t - s + 1)\mu_0^2/2]^+,$

extending the definition of $S_{nst} = \sum_{i=s}^{t} X_{ni}$ to $s \le 0$.
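
The supremum form of $R_{nt}$ is what the proof manipulates, but in practice a CUSUM score is updated recursively. The sketch below checks, for a single stream, that the recursion $R_t = \max(0, R_{t-1} + \mu_0 X_t - \mu_0^2/2)$ with $R_0 = 0$ agrees with the supremum over $1 \le s \le t$. It assumes that (5.1) is this standard CUSUM recursion; the function names are illustrative.

```python
import random

def cusum_recursive(xs, mu0):
    """R_t from the recursion R_t = max(0, R_{t-1} + mu0*x_t - mu0^2/2), R_0 = 0."""
    r = 0.0
    for x in xs:
        r = max(0.0, r + mu0 * x - mu0 * mu0 / 2.0)
    return r

def cusum_sup(xs, mu0):
    """R_t as the max over 1 <= s <= t of [mu0*S_{st} - (t-s+1)*mu0^2/2]^+."""
    t = len(xs)
    best = 0.0      # the positive part makes 0 a lower bound
    partial = 0.0
    # Walk s downward from t to 1, accumulating S_{st} = x_s + ... + x_t.
    for s in range(t, 0, -1):
        partial += xs[s - 1]
        best = max(best, mu0 * partial - (t - s + 1) * mu0 * mu0 / 2.0)
    return best

if __name__ == "__main__":
    random.seed(0)
    sample = [random.gauss(0.0, 1.0) for _ in range(200)]
    print(cusum_recursive(sample, 0.8), cusum_sup(sample, 0.8))
```

The two functions agree up to floating-point rounding, which is why the stream-wise scores can be maintained in constant memory per stream.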

Since = limt^oo to show (8.4), it suffices to show that 

(8.7) = o(iV(^-i)/2+^), 

(8.8) = o(iV(^-i)/2+.)_ 

We conclude (8.7) from (5.5) and (8.3). Let Q = supj>^(/xo5'j — jpQ/2) 
and R' = supj>o(ci;5j — ju:‘^pQ/2) for some w > ^. By (5.5), for x > 0, 

Poo{Rn> Rnt,Rn> x} < P{Q > x} < P{R' > LOX + t{u} - Uj'^)pl/2} 


Hence by selecting ui close enough to it follows that 

/ OO 

y^/^Poo{R*n > RnuR*n > x}dx = 0{e-^^^o/^+% 

-OO 


and (8.8) holds for $t \ge 4\mu_0^{-2}(1-\zeta)\log N$. □

We conclude (8.1) from (8.4), $p_0 \sim cN^{(\zeta-1)/2}$ and LLN. We note that
indeed $t\ (= \nu + k - 1) \ge 4\mu_0^{-2}(1-\zeta)\log N$ when
(a) < /3 < p(/3,C) = 13- V-’ /^o = 2 ^ 1 , u > 2// ^ _

/3)logiV, 

(b) < /3 < 1 - c, /^o = ^ 0 log ^■ 

It remains for us to check (8.2) on: 

(a) $\frac{1-\zeta}{2} < \beta \le \frac{3(1-\zeta)}{4}$. Since $\mu_0 = 2\mu$, we have $U_n \sim N(0, k\mu_0^2)$ when
$n \in \mathcal{N}$. Hence

(8.9) E,[go{U+)\n £ Af] 

~ Po[E^{e^^/\un<VN}\'^ e AA) - C] 

7-00 ^2^ 

= cAhC-l)/2cfcM^/8^/ ^iV-fcM§/2 N 

' AtoVfc ' 

Check that > A^/^+(C“i)/2+2e for e > 0 small and N 

large. Moreover p{P,C) < therefore vn > 4(1 + h)p(/3, C) log A^] 

for h > 0 small. Hence by (8.9), 

(8.10) E,[go{U;t)\n G AA] ~ [c + o{l)]N^+<^-^+^^. 

By (8.5), (8.6) and (8.10), we can conclude

$E_\nu[\tilde g_0^2(U_n^+) \mid n \in \mathcal{N}] = O(|E_\nu[\tilde g_0(U_n^+) \mid n \in \mathcal{N}]|),$

and (8.2) then follows from (8.10), Chebyshev’s inequality and $g_0 \ge \tilde g_0$.

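(b) $\frac{3(1-\zeta)}{4} < \beta < 1 - \zeta$. For $n \in \mathcal{N}$, express $U_n = k\mu_0(\mu - \mu_0/2) + \sqrt{k}\mu_0 Y_n$,
with $Y_n \sim$ N(0,1). Let $\mathcal{N}_1 = \{n \in \mathcal{N} : Y_n \ge \sqrt{2(1-\zeta-\beta-2\epsilon)\log N}\}$ for
$\epsilon > 0$ satisfying

(8.11) $1 - \zeta - \beta - 2\epsilon > (1 - \zeta - \beta)/(1 + \delta).$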

By LLN,

(8.12) $\#\mathcal{N}_1 = (\#\mathcal{N})[C + o_p(1)]N^{\beta+\zeta-1+2\epsilon}/\sqrt{\log N} = [C + o_p(1)]N^{\zeta+2\epsilon}/\sqrt{\log N}.$

Let r = po/p{= < 2). Since r - 1 = by (7.4) and (8.11), 

for n £ Ml with N large, 

Un > kp?{r-'^)+ pr{r - i)^ 2fcp(AygjV 
= 2piP, C) log N[il + 6){r - 4) + - r] + 0(1)

> ?’^p(/3,C)logiV = (1 - C)logA^, 

9o{Un) > log[l - 1)] - log[l +Po(C - 1)] log(l + c). 
Hence by (8.12), 

E 9o{U^) > [C + Op{l)]N‘^+^yV^- 

nSA/i 

This, combined with 

E 9oiUn) > -[C + Op{l)]N^~^ logpo 

n&M\Afi 

~ -[0 + Op(l)]iVi-^+(^-i)/2^ 

and noting that 1 — j3 + < 1 — |(1 — C) < C; shows (8.2). 
APPENDIX A: MOTIVATIONS BEHIND (??)

In view of the need to satisfy (6.5), we choose $k$ to be the “largest” possible
such that

(A.1) $E_1\ell_{\cdot k} \le 2N^{\zeta}/3.$

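Under $P_1$, $Z_{nk} \sim (1-p)$N(0,1) $+\ p$N$(\mu\sqrt{k}, 1)$. Let $Y \sim$ N(0,1). Since

(A.2) $\log(1 + x) \le x,$

it follows that

(A.3) $E_1\ell_{\cdot k}\ \{= N(1-p)E\log[1 + p(e^{Y\mu\sqrt{k} - k\mu^2/2} - 1)] + NpE\log[1 + p(e^{Y\mu\sqrt{k} + k\mu^2/2} - 1)]\}$

$\le NpE\log[1 + p(e^{Y\mu\sqrt{k} + k\mu^2/2} - 1)].$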

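Case 1(a): $\frac{1-\zeta}{2} < \beta \le \frac{3(1-\zeta)}{4}$. It follows from applying (A.2) on (A.3) that

(A.4) $E_1\ell_{\cdot k} \le Np^2(e^{k\mu^2} - 1) \sim Np^2e^{k\mu^2}.$

Hence choosing $k = \lfloor (1-\delta)\mu^{-2}(2\beta+\zeta-1)\log N \rfloor$ as in (6.7) ensures $E_1\ell_{\cdot k} =
o(N^{\zeta})$, and so (A.1) holds.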
Case 1(b): $\frac{3(1-\zeta)}{4} < \beta < 1 - \zeta$. The inequality in (A.4) is further sharpened
to allow for larger $k$ satisfying (A.1). Let $\omega$ be the root of

(A.5) $pe^{\omega\mu\sqrt{k} + k\mu^2/2} = 1.$

By (A.5), applying the inequalities

(A.6) $\log(1 + x) \le \begin{cases} x & \text{if } x \le 1, \\ \log 2 + \log x & \text{if } x > 1, \end{cases}$

on (A.3) results in 

= ii^fk) + 0{Npke-^‘'/^) 

= 0{Npke~^^ 

By (A.5),

(A.8) $\omega\mu\sqrt{k} + k\mu^2/2 = \beta\log N\ (\Rightarrow \mu\sqrt{k} = -\omega + \sqrt{\omega^2 + 2\beta\log N}),$

and by (A.7), we satisfy (A.1) if

(A.9) $(1 - \beta)\log N - \omega^2/2 < \zeta\log N\ (\Rightarrow \omega > \sqrt{2(1 - \beta - \zeta)\log N}).$

Combining (A.8) and (A.9) leads to $k < 2\mu^{-2}(\sqrt{1-\zeta} - \sqrt{1-\zeta-\beta})^2\log N$.
Hence the choice of $k = \lfloor (1-\delta)2\mu^{-2}(\sqrt{1-\zeta} - \sqrt{1-\zeta-\beta})^2\log N \rfloor$ in (6.7).
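
The step from (A.8) and (A.9) to the closed-form bound on $k$ is a short piece of algebra that is easy to confirm numerically. The sketch below assumes $p = N^{-\beta}$ exactly, so that (A.5) reads $\omega\mu\sqrt{k} + k\mu^2/2 = \beta\log N$, and plugs in the boundary value of $\omega$ from (A.9); the function names are illustrative.

```python
import math

def k_from_omega(beta, zeta, mu, n_streams):
    """Solve (A.8) for k at the boundary omega = sqrt(2*(1-beta-zeta)*log N) of (A.9)."""
    log_n = math.log(n_streams)
    omega = math.sqrt(2.0 * (1.0 - beta - zeta) * log_n)
    # Positive root of (mu*sqrt(k))^2/2 + omega*(mu*sqrt(k)) - beta*log N = 0.
    root_k_mu = -omega + math.sqrt(omega**2 + 2.0 * beta * log_n)
    return (root_k_mu / mu) ** 2

def k_closed_form(beta, zeta, mu, n_streams):
    """2 * mu^-2 * (sqrt(1-zeta) - sqrt(1-zeta-beta))^2 * log N."""
    x = math.sqrt(1.0 - zeta)
    y = math.sqrt(1.0 - zeta - beta)
    return 2.0 * (x - y) ** 2 * math.log(n_streams) / mu**2
```

Since $\omega^2 + 2\beta\log N = 2(1-\zeta)\log N$ at the boundary, the two expressions coincide, and larger $\omega$ gives smaller $k$, which is the direction of the inequality.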

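Case 2: $\beta > 1 - \zeta$. By (A.3) and (A.6), choosing $k = \lfloor \delta N^{\beta+\zeta-1} \rfloor$ as in
(6.7) ensures that

$E_1\ell_{\cdot k} \le [1 + o(1)]NpE(Y\mu\sqrt{k} + k\mu^2/2) \sim \delta\mu^2N^{\zeta}/2,$

and (A.1) indeed holds for $\delta > 0$ small.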

APPENDIX B: MINIMUM DETECTION DELAY UNDER THE MINIMAX SETTING

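Let $I_N = (I_1, \ldots, I_N)$, where $I_n = \mathbf{1}_{\{n \in \mathcal{N}\}}$. Let $E_{\nu, I_N}$ denote expectation
with respect to $X_{nt} \sim$ N$(\mu_{nt}, 1)$, with $\mu_{nt} = \mu I_n\mathbf{1}_{\{t \ge \nu\}}$. For a given stopping
rule $T$, define

$D_{N,m}(T) = \sup_{1 \le \nu < \infty}\ \max_{I_N:\,\#\mathcal{N} = m} E_{\nu, I_N}(T - \nu + 1 \mid T \ge \nu).$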

The following is an analogue of Theorem 1 in a minimax setting.

Theorem 4. Let $T$ be a stopping rule such that ARL$(T) \ge \gamma$, with
$\log\gamma \sim N^{\zeta}$ for some $\zeta > 0$. Let $m \sim N^{1-\beta}$ for some $0 < \beta < 1$.

(a) If $\frac{1-\zeta}{2} < \beta < 1 - \zeta$, then

$\liminf_{N \to \infty} \frac{D_{N,m}(T)}{\log N} \ge 2\mu^{-2}\rho(\beta,\zeta).$

(b) If $\beta > 1 - \zeta$, then

$\liminf_{N \to \infty} \frac{\log D_{N,m}(T)}{\log N} \ge \beta + \zeta - 1.$

Proof. Let $k$ be chosen as in (6.7). By Lemma 1 we can find $s \ge 1$ such
that

(B.1) $P_\infty\{T \ge s + k \mid T \ge s\} \ge 1 - k/\gamma.$

Let $t = s + k - 1$, and consider the test, conditional on $T \ge s$, of

$H_0$ : $X_{nu} \sim$ N(0,1) for $1 \le n \le N$, $1 \le u \le t$,
vs $H_{s,m}$ : $X_{nu} \sim$ N$(\mu\mathbf{1}_{\{u \ge s,\, n \in \mathcal{N}\}}, 1)$ for $1 \le n \le N$, $1 \le u \le t$,
with $\mathcal{N}$ a random subset of $\{1, \ldots, N\}$ of size $m$.

By (B.1) the test rejecting $H_0$ when $T < s + k$ has Type I error probability
not exceeding $k/\gamma$.
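Let $\mathcal{A}_j = \{\mathcal{N} : \#\mathcal{N} = j\}$. At time $t$, the (conditional) likelihood ratio
between $H_{s,m}$ and $H_0$ is $L_m\ (= L_{mst})$, where

$L_j = \binom{N}{j}^{-1}\sum_{\mathcal{N} \in \mathcal{A}_j}\Big(\prod_{n \in \mathcal{N}} e^{Z_n\mu\sqrt{k} - k\mu^2/2}\Big), \quad Z_n = Z_{nst}.$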

Let $P_{s,m}$ ($E_{s,m}$) denote probability (expectation) with respect to $H_{s,m}$.
We shall check on various cases below that

(B.2) $P_{s,m}\{L_m \ge J\} \to 0, \quad J = \exp(2N^{\zeta}/3).$

Let $B$ be such that $P_{s,m}\{J \ge L_m \ge B\} = \exp(-N^{\zeta}/4)$. It follows from
(B.2) that

(B.3) $P_{s,m}\{L_m \ge B\}\ (= P_{s,m}\{L_m \ge B \mid T \ge s\}) \to 0,$
and that for $N$ large,

(B.4) $P_\infty\{L_m \ge B\}\ (= P_\infty\{L_m \ge B \mid T \ge s\}) \ge P_\infty\{J \ge L_m \ge B\}$

$= E_{s,m}[L_m^{-1}\mathbf{1}_{\{J \ge L_m \ge B\}}] \ge J^{-1}\exp(-N^{\zeta}/4) > k/\gamma.$

By (B.1), (B.4) and the Neyman-Pearson Lemma, the test rejecting $H_0$ when
$L_m \ge B$ is at least as powerful as the one based on $T$, that is

(B.5) $P_{s,m}\{T \ge s + k \mid T \ge s\} \ge P_{s,m}\{L_m < B\}.$

It follows from (B.3) and (B.5) that

$D_{N,m}(T) \ge E_{s,m}(T - s + 1 \mid T \ge s) \ge kP_{s,m}\{T \ge s + k \mid T \ge s\} = k[1 + o(1)],$

and the proof of Theorem 4 is complete. □


We shall now proceed to check (B.2). Let $p_1 = 2N^{-\beta}$ and

(B.6) $L(p_1) = \prod_{n=1}^N (1 - p_1 + p_1e^{Z_n\mu\sqrt{k} - k\mu^2/2}) = \sum_{j=0}^N \binom{N}{j}p_1^j(1-p_1)^{N-j}L_j.$

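Since $Z_n \sim$ N$(\mu\sqrt{k}, 1)$ if $n \in \mathcal{N}$ and $Z_n \sim$ N(0,1) if $n \notin \mathcal{N}$, it follows that

$E_{s,m}e^{Z_n\mu\sqrt{k} - k\mu^2/2} = \begin{cases} e^{k\mu^2} & \text{if } n \in \mathcal{N}, \\ 1 & \text{if } n \notin \mathcal{N}. \end{cases}$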

Therefore by (B.6),

(B.7) $E_{s,m}L(p_1) = (1 - p_1 + p_1e^{k\mu^2})^m,$

the exponent $m$ in (B.7) due to $\#\mathcal{N} = m$ for each $\mathcal{N}$ under $H_{s,m}$. By the
monotonicity $E_{s,m}L_1 \le \cdots \le E_{s,m}L_N$, and by $P\{W \ge m\} \to 1$ for $W \sim$
Binomial$(N, p_1)$, it follows from (B.6) that

(B.8) $E_{s,m}L(p_1) \ge P\{W \ge m\}E_{s,m}L_m = [1 + o(1)]E_{s,m}L_m.$
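
The binomial expansion in (B.6) is what links the product-form mixture likelihood ratio to the fixed-size likelihood ratios $L_j$. A brute-force check for small $N$ is sketched below. It assumes $L_j$ is the average, over all subsets of size $j$, of the product of per-stream likelihood ratios $w_n = e^{Z_n\mu\sqrt{k} - k\mu^2/2}$; the function names are illustrative, and the enumeration is exponential in $N$, so it is only meant for small inputs.

```python
import itertools
import math

def mixture_product(ws, p1):
    """Product form: prod_n (1 - p1 + p1*w_n), w_n the per-stream likelihood ratio."""
    out = 1.0
    for w in ws:
        out *= 1.0 - p1 + p1 * w
    return out

def subset_average(ws, j):
    """Assumed L_j: average over all size-j subsets of the product of w_n in the subset."""
    n = len(ws)
    total = sum(math.prod(ws[i] for i in idx)
                for idx in itertools.combinations(range(n), j))
    return total / math.comb(n, j)

def binomial_expansion(ws, p1):
    """Sum form: sum_j C(N,j) * p1^j * (1-p1)^(N-j) * L_j."""
    n = len(ws)
    return sum(math.comb(n, j) * p1**j * (1.0 - p1) ** (n - j) * subset_average(ws, j)
               for j in range(n + 1))
```

Reading the weights $\binom{N}{j}p_1^j(1-p_1)^{N-j}$ as the probabilities of $W \sim$ Binomial$(N, p_1)$ gives $E_{s,m}L(p_1) = \sum_j P\{W = j\}E_{s,m}L_j$, from which (B.8) follows by monotonicity.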

By (B.7), (B.8) and Markov’s inequality, to show (B.2) it suffices to show
that

(B.9) $(1 - p_1 + p_1e^{k\mu^2})^m = o(\exp(2N^{\zeta}/3)),$

and this can be easily done for the following cases.
Case 1(a): $\frac{1-\zeta}{2} < \beta \le \frac{3(1-\zeta)}{4}$, $k = \lfloor (1-\delta)\mu^{-2}(2\beta+\zeta-1)\log N \rfloor$. We show
(B.9) by applying the inequality

$(1 - p_1 + p_1e^{k\mu^2})^m \le \exp(mp_1e^{k\mu^2}).$

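Case 2: $\beta > 1 - \zeta$, $k = \lfloor \delta N^{\beta+\zeta-1} \rfloor$, $\delta > 0$ small. We show (B.9) by applying
the inequalities (for large $N$),

$(1 - p_1 + p_1e^{k\mu^2})^m \le (2p_1e^{k\mu^2})^m \le e^{mk\mu^2}.$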

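The final case below is more complicated. Additional truncation arguments
are needed to show (B.2).

Case 1(b): $\frac{3(1-\zeta)}{4} < \beta < 1 - \zeta$, $k = \lfloor (1-\delta)2\mu^{-2}(x-y)^2\log N \rfloor$, where
$x = \sqrt{1-\zeta}$ and $y = \sqrt{1-\zeta-\beta}$. The outline of the arguments needed to
show (B.2) is as follows.

1. Let $\tilde Z_n = \min(Z_n, \omega)$, where

$\omega\ (= \omega_N) = \sqrt{2(1-\zeta)\log N + 2\log\log N}\ (\sim x\sqrt{2\log N}).$

Let $p_1 = 2N^{-\beta}$ and

(B.10) $\tilde L(p_1) = \prod_{n=1}^N (1 - p_1 + p_1e^{\tilde Z_n\mu\sqrt{k} - k\mu^2/2}).$

Show that $E_{s,m}\tilde L(p_1) = o(J^{1/2})\,[= o(\exp(N^{\zeta}/3))]$.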

2. Argue that we have monotonicity Eg^mLi < • • • < Eg^mLN, where 



and conclude that 



3. Let C > 0 and = Lmi-G-, where G{= Gn) is the event that 


$\max_{1 \le n \le N} Z_n \le C\sqrt{\log N}, \quad F_N := \#\{n : Z_n > \omega\} \le N^{\zeta}/(\log N)^{5/4}.$


Show that uniformly under G, 


max ( Y[ = o{J^/^)[= o{exp{N</3))], 

MeA 


max 


and conclude that Lm/Lm = o(J^/^). 
4. Show that for C large, Ps,m{GN) 1 and so Ps^m{Lm > -^m} ^ 0.

By steps 1, 2 and Markov’s inequality, Ps^m{Lm > —)• 0. By step 3 we 

can further conclude that Ps^m{Lm > >/} —t 0, and (B.2) then follows from 
step 4. We shall now provide details to the above outline. 

1. If n 0 J\f, then and if n G J\f, then 

12 ^ - 2/iVfe) + [1 - 4>(a; - 

Since = m for each M under Hg^m, by (B.IO), 

Es,mL{pi) < [l+Pio{N-"-^y")r < exp[mpio(iV"'-2^')] = 

2. The monotonicity follows from stochastically larger when n G W 
compared to when n 0 J\f, whereas the inequality in (B.ll) follows 
from the monotonicity and the expansion 

j=o J 

3. Under G, there exists C > 0 not depending on N such that for all 

C '^m,i 

n < exp(FjvCv/k^//Vfc) 

nSAT 

s e='p((i;i%ii7i'ClogJV) =o(A^). 

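4. Let $\bar\Phi(\cdot) = 1 - \Phi(\cdot)$. We apply Markov’s inequality to show $P_{s,m}(G_N) \to 1$
by checking that

(B.12) $m\bar\Phi(\omega - \mu\sqrt{k}) + (N - m)\bar\Phi(\omega) = o(N^{\zeta}/(\log N)^{5/4}),$

and that for $C$ large,

(B.13) $m\bar\Phi(C\sqrt{\log N} - \mu\sqrt{k}) + (N - m)\bar\Phi(C\sqrt{\log N}) \to 0.$

By Mills’ inequality, (B.13) holds for $C$ large and $N\bar\Phi(\omega) = o(N^{\zeta}/(\log N)^{5/4})$.
Moreover

$\limsup_{N \to \infty} \log_N[m\bar\Phi(\omega - \mu\sqrt{k})] < 1 - \beta - y^2 = \zeta,$

and so (B.12) holds as well.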